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Machine learning-based modeling of reactor physics problems has attracted increasing interest in recent years. 
Despite some progress in one-dimensional problems, there is still a paucity of benchmark studies that are easy 
to solve using traditional numerical methods albeit still challenging using neural networks for a wide range 
of practical problems. We present two networks, namely the Generalized Inverse Power Method Neural Net- 
work (GIPMNN) and Physics-Constrained GIPMNN (PC-GIPIMNN) to solve K-eigenvalue problems in neu- 
tron diffusion theory. GIPMNN follows the main idea of the inverse power method and determines the lowest 
eigenvalue using an iterative method. The PC-GIPMNN additionally enforces conservative interface condi- 
tions for the neutron flux. Meanwhile, Deep Ritz Method (DRM) directly solves the smallest eigenvalue by 
minimizing the eigenvalue in Rayleigh quotient form. A comprehensive study was conducted using GIPMNN, 
PC-GIPMNN, and DRM to solve problems of complex spatial geometry with variant material domains from 
the field of nuclear reactor physics. The methods were compared with the standard finite element method. The 
applicability and accuracy of the methods are reported and indicate that PC-GIPMNN outperforms GIPMNN 


and DRM. 


Keywords: Neural network, Reactor physics, Neutron diffusion equation, Eigenvalue problem, Inverse power method 


I. INTRODUCTION 


In the nuclear engineering domain, the fundamental mode 
solution of the K-eigenvalue problem based on the steady- 
state multigroup neutron diffusion theory is crucial for simu- 
lation and analysis of nuclear reactors. The eigenvalue equa- 
tion can be expressed as follows: 
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r in the g-th energy group, D,, Er, Egg Xg and vos, de- 
note spatially dependent (possibly discontinuous) parameters 
that reflect the material properties in a reactor core [1]. Nu- 
clear engineers and analysts must numerically determine the 
fundamental mode eigenvalue (commonly called Keff) and 
the corresponding eigenvector for a given geometry/material 
configuration. For numerical solution, (1) must be discretized 
and reduced to a set of G-coupled algebraic equations, which 
can be expressed using matrices as follows: 
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This is often termed as the generalized eigenvalue problem 
because coefficient matrices occur on both sides of the equa- 
tion. Many mature numerical methods, such as the finite 
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difference method [2], nodal collocation method [3], finite- 
element method [4-6], and nodal expansion method [7-9], 
have been proposed to solve neutron diffusion equations. 
Among the methods, the nodal expansion method is widely 
used because it is easier to implement and requires less com- 
putational effort than other methods. The power iteration 
method is the most well-known numerical method for solving 
the principal K-eigenvalue [10]. A detailed review of con- 
ventional numerical methods for solving K-eigenvalue prob- 
lems is available in recent nuclear reactor physics textbooks 
[1, 11, 12]. Although traditional and mature numerical meth- 
ods are currently widely used in nuclear reactor physics to 
solve K-eigenvalue problems with acceptable engineering ac- 
curacy and computational cost, state-of-the-art neural net- 
works to solve K-eigenvalue problems are still in the infancy 
stages. However, neural networks exhibit the potential to pro- 
vide another option to solve nuclear engineering problems 
with the progress of new algorithms and hardware. 

As mentioned above, many traditional numerical methods 
were proposed to solve neutron diffusion equations. How- 
ever, a dense mesh is required to ensure highly accurate re- 
sults. The model consumes more computational resources 
if the mesh is extremely dense. Inaccurate solutions are ob- 
tained if the mesh is coarse. Moreover, for high-dimensional 
problems, classical methods are either less efficient or not 
successful due to the problems involving dimensionality. A 
neural network can potentially enhance its performance by 
using a capable hypothesis space due to its relatively low sta- 
tistical error [13]. 

Neural networks exhibit multiple potential benefits in solv- 
ing nuclear engineering problems when compared with con- 
ventional numerical methods. 


e They provide mesh-free solutions to approximate 
physics fields in nuclear reactors, in which mesh gener- 
ation is significantly complex due to high heterogeneity 
of geometry and material. 


e They provide a general framework to solve high- 
dimensional problems governed by parameterized 
PDEs, particularly for original neutron transport equa- 
tion.This equation comprises seven variables: three 
spatial variables, two directional variables, one energy 
variable, and one temporal variable. 


They are able to seamlessly incorporate prior data (po- 
tentially with noise) into existing algorithms given that 
data assimilation is necessary as a post-process for con- 
ventional numerical methods [14-19]. 


They provide a general framework to solve inverse 
problems with hidden physics. Conversely, they are 
typically prohibitively expensive and require different 
formulations and elaborate computer codes for conven- 
tional numerical methods. 


In recent years, neural networks have been widely used to 
solve Partial Differential Equations (PDEs) [20] and achieved 
remarkable success. 

Based on neural networks, a large number of methods were 
proposed to solve PDE, such as the deep backward stochastic 
differential equation (BSDE) method [21, 22], deep Galerkin 
method (DGM) [23], deep Ritz method (DRM) [24], and 
Physics-Informed Neural Network (PINN) [25]. The deep 
BSDE method reformulates PDEs using backward stochastic 
differential equations and approximates the gradient of an un- 
known solution using neural networks. Although DGM and 
PINN appear independently under two names in the litera- 
ture, they are similar. They both train neural networks by 
minimizing the mean squared error loss of the residual equa- 
tion. However, DRM reformulates the original problem into 
a variational problem and trains neural networks by minimiz- 
ing the energy function of the variational problem. Specif- 
ically, DRM and PINN [25] attracted widespread attention. 
Extensive studies focused on these methods to solve a vari- 
ety of problems, including fiber optics [26], hyperelasticity 
[27], solid mechanics [28], heat transfer problems [29-31], 
inverse problems [32-35], and uncertainty quantification [36—- 
39]. However, a few studies focused on solving eigenvalue 
problems [24, 40-45]. 

The need to solve eigenvalue problems can be traced back 
to 2018 [24]. A deep Ritz method to solve variational prob- 
lems was proposed, and several examples elucidated on how 
to use DRM to solve eigenvalue problems [24]. First, the 
original eigenvalue problem was transformed into a varia- 
tional problem. Then, a specially defined loss function was 
constructed, termed as the Rayleigh quotient, using the varia- 
tional principle [46]. The Rayleigh quotient is a well-known 
approximation of the eigenvalue of matrix A, which is de- 


fined by R = z As, Finally, they minimized the loss func- 
tion and obtained the smallest eigenvalue. Similarly, some 
studies [41, 42] directly used PINN to solve eigenvalue prob- 
lems. In contrast to DRM transferring the original problem to 
a variational problem, the PINN solves eigenvalue problems 
without variation. Neural networks are used in PINN to rep- 


resent the function, and automatic differentiation (AD) [47] 


is used to acquire the vector impacted by the differential op- 
erators. The loss function is obtained using the Rayleigh quo- 
tient. The smallest eigenvalue and corresponding eigenvec- 
tor were then solved using optimization tools. Moreover, the 
deep forward-backward stochastic differential equation (FB- 
SDE) method [48] was proposed to solve eigenvalue prob- 
lems, which is an expansion of the deep BSDE method. In the 
method, the eigenvalue problem is reformulated as a fixed- 
point problem of semigroup flow induced by the operator. 
Other studies [43, 44] proposed an alternative method to learn 
the eigenvalue problem by adding one or two regularization 
terms to the loss function. More recently, [49], a neural net- 
work framework based on the power method [50] was pre- 
sented to solve eigenvalue problems and smallest eigenvalue 
problems, where the eigenfunction is expressed by the neu- 
ral network and iteratively solved following the idea of the 
power method or inverse power method. However, the scope 
was limited to linear operators and certain special eigenvalue 
problems. 


In a recent study, PINN was applied to solve neutron diffu- 
sion equations [45, 51], where the authors used a free learn- 
able parameter to approximate the eigenvalue and a novel 
regularization technique to exclude null solutions from the 
PINN framework. A conservative physics-informed neural 
network (cPINN) was proposed in discrete domains for non- 
linear conservation laws [52]. Moreover, cPINN [53] was ap- 
plied to solve heterogeneous neutron diffusion problems in 
one-dimensional cases, which develops PINN for each sub- 
domain and considers additional conservation laws along the 
interfaces of subdomains (a general consideration in reactor 
physics [11]), which is involved in neural network training as 
the variable to be optimized. 


More recently, a data-enabled physics-informed neural net- 
work (DEPINN) [40] was proposed to solve neutron diffusion 
eigenvalue problems. To achieve acceptable engineering ac- 
curacy for complex engineering problems, it is suggested that 
a very small amount of prior data from physical experiments 
be used to improve the training accuracy and efficiency. In 
contrast to PINN, which solves the neutron diffusion eigen- 
value problem directly, an autoencoder-based machine learn- 
ing method in combination with the reduced-order method 
[54, 55] was proposed [56] to solve the same problem. How- 
ever, it still relies on solving governing equations with tradi- 
tional numerical methods such as the finite difference method. 


Although DRM provides a way to solve eigenvalue prob- 
lems with neural networks, as shown in Section IV B 2, re- 
sults indicate that DRM is not stable when solving two- 
dimensional cases. First, DRM learns the eigenvalue and 
eigenfunction at the early stage of the training process. Sub- 
sequently, DRM attempts to learn a smaller eigenvalue af- 
ter it is close to the true eigenvalue. Finally, DRM suc- 
cessfully learns one smaller eigenvalue that may be close to 
the true eigenvalue and one incorrect eigenfunction that is 
meaningless and far from the true eigenfunction. Addition- 
ally, the framework of a neural network based on the power 
method [50] is unsuitable for generalized eigenvalue prob- 
lems. Therefore, it is necessary to propose a new algorithm 
to solve K-eigenvalue problems. 


The study focuses on eigenvalue problems, which are also 
interface problems in which the eigenfunctions are continu- 
ous on the interface, and the derivatives of the eigenfunction 
are not continuous at the interface. Specifically, in the nuclear 
reactor physics domain, this is a general problem in which 
the reactor core is composed of fuel assemblies of different 
fissile nuclides enrichments [1]. Some studies focused on the 
use of neural networks to solve elliptic interface problems. 
Some researchers [57] use the idea of DRM and formulated 
the PDEs into the variational problems, which can be solved 
using the deep learning approach. They presented a novel 
mesh-free numerical method for solving elliptical interface 
problems based on deep learning. [58]. They employed dif- 
ferent neural networks in different subdomains and reformu- 
lated the problem as a least-squares problem. A similar case 
exists, in which the authors [53] enforce the interface con- 
ditions using piecewise neural network. In contrast to the 
methods, a discontinuity-capturing shallow neural network 
(DCSNN) [59] has been proposed for elliptic interface prob- 
lems. The crucial concept of DCSNN is that a d-dimensional 
piecewise continuous function can be extended to a continu- 
ous function defined in (d + 1)-dimensional space, where the 
augmented coordinate variable labels the pieces of each sub- 
domain. However, to the best of the authors’ knowledge, only 
a few studies focused on the use of neural networks to solve 
eigenvalue problems that incorporate interface problems in- 
volving regions of different materials. However, challenges 
exist at least on three fronts. 


e Designing a neural network that is more suitable for 
K-eigenvalue problems for more complicated/medium- 
size test problems. 


e Dealing with the interface problem in a more general 
and understandable manner when designing the neural 
network. 


e Proposing a framework effectively enhances the robust- 
ness of the neural network and improves the efficiency 
of utilizing the noisy prior data. 


To address the aforementioned challenges and advance be- 
yond the state-of-the-art research [45, 51, 53], we initially 
introduced the study [40]. This served as a preliminarily 
demonstration of the applicability of the PINN approach to 
reactor physics eigenvalue problems in complex engineering 
scenarios. The contributions of this study are as follows. 


e Firstly, we extend the Inverse Power Method Neural 
Network (IPMNN)[49] to the so-called Generalized In- 
verse Power Method Neural Network (GIPMNN) to 
solve the smallest eigenvalue and the related eigenfunc- 
tion of the generalized K-eigenvalue problems. Com- 
pared to DEPINN from our previous study [53], we 
omit the prior data in the training process and attempt 
to solve the K-eigenvalue problems from a data-free 
mathematical/numerical perspective. 


e Then, we propose a Physics Constrained GIPMNN 
(PC-GIPMNN) to address the interface problem in a 


more general and understandable manner with respect 
to previous studies [45, 51, 53]. 


Finally, we conduct a thorough comparative study of 
GIPMNN, PC-GIPMNN, and DRM using a variety of 
numerical tests. We evaluate the applicability and ac- 
curacy of these three methods using typical 1D and 2D 
test cases in reactor physics, particularly accounting for 
material discontinuities in different geometries. In the 
1D example, we determine the optimal ratio of outer to 
inner iterations, a finding that may be particularly rele- 
vant for GIPMNN. Additionally, we observe the failure 
of DRM in the 2D experiments, whereas PC-GIPMNN 
consistently outperforms GIPMNN and DRM over a 
set number of epochs. 


The rest of this paper is organized as follows. The gov- 
erning equations for the eigenvalue problems are presented in 
Sect. II. In Sect. II, we propose GIPMNN and PC-GIPMNN 
and introduce DRM in our cases. In Sect. IV, the results 
of 1D and 2D test cases are listed to verify the three methods. 
Finally, conclusions and future research are discussed in Sect. 
V. 


Il. K-EIGENVALUE PROBLEMS 


This section introduces the equations that govern the neu- 
tron criticality over a spatial domain. We recall Eqs. (1) and 
(2). The generalized K-eigenvalue problem can be formulated 
as follows: 


in Q, 


Lo = 46, 
{ on OQ, 9) 


Bọ =g, 


where domains 2 C R4, £, Q, and B denote the differential 
operators acting on the functions defined in the interior of 
Q and at the boundary of Q( OQ). Furthermore, denotes 
the eigenfunction of the system and AÀ denotes the associated 
eigenvalue. In this preliminary study, inspired by the notable 
work in [56], we utilize the one-group steady-state diffusion 
equation for criticality, framing it as a generalized eigenvalue 
problem. It is expressed as: 


—V-(DV¢) + D3 = Aud, (4) 


where eigenfunction ġo denotes the neutron flux, which is a 
scalar quantity used in nuclear reactor physics. It corresponds 
to the total length travelled by all free neutrons per unit time 
and volume. X°, £f, and v denote the absorption cross sec- 
tion, fission cross section, and average number of neutrons 
produced per fission event, respectively. We follow [56] and 
use the diffusion coefficient approximation 


1 
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Here, X° denotes the cross-section where a neutron scatters 
in a different direction. The eigenvalue A is obtained by mul- 
tiplying the neutron source in Eq. (4). The value balances the 


terms that produce neutrons with those that account for the 
losses. This is defined as the reciprocal of Keg, i.e., A = a 
where 


number of neutrons in one generation 


kett = 


number of neutrons in the preceding generation ` 


Two main boundary conditions are imposed on the diffu- 
sion equation. One condition represents a surface on which 
neutrons are reflected back into the domain (reflective condi- 
tion), and the other condition represents surfaces that allow 
neutrons to escape from the system (vacuum or bare condi- 
tion). Both the conditions are satisfied by relating the flux 
solution to its gradient on the boundary, i.e., 


1 
1? bare surface, 


-~5DV¢-n = (7) 


0 reflective surface, 


where n denotes an outward point normal to the surface. 


A. PINN as a eigenvalue solver 


In this subsection, we discuss using PINN to solve the gen- 
eralized eigenvalue problem (2). 

The eigenfunction of the operator £ is approximated by 
N°, i.e., d(x) = N°’ (x). Then, Lo and B¢ can be computed 
using the AD. For the boundary conditions: A penalty term 
(8) is added to the loss function in PINN, which penalizes the 
discrepancy between the approximated value on the boundary 
and exact boundary condition, where N; denotes the number 
of sampling points on OQ and z; is one point in the sampling 
set {a}... 


No 
Loss» = )-|B®(xi) — g(ai)|?. (8) 
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As mentioned previously, we are concerned with the small- 
est eigenvalue and associated eigenfunction. Hence, the 
eigenvalue (Rayleigh quotient) is viewed as a loss term in 
PINN to attain the lowest eigenvalue. 

Finally, the total loss function (10) in PINN corresponds to 
the weighted sum of the two objectives (8) and (9), where a 
and 8 corresponds to the weights. 


LosStotal = QÀ? + BLossp. (10) 


Unfortunately, PINN does not work for the cases in the 
study and even performs worse than DRM. Therefore, we 
only compared the results of our method with those of DRM. 


Remark 1 /t should be noted that the square of the eigen- 
value is used as the loss function in PINN. Given that the 
smallest eigenvalue implies that the absolute value is the 
smallest, PINN attempts to determine a negative infinity value 
without using a square term. 


Il. METHODOLOGIES 


In this section, we extend our previous study [49] and dis- 
cuss the use of a neural network to numerically solve the 
smallest eigenvalue problem. The main concept is to use a 
neural network to approximate the eigenfunction and com- 
pute the eigenvalue based on the Rayleigh quotient using 
an eigenvector expressed by the points calculated using the 
eigenfunction. 


A. Neural Network Architecture 


Next, we discuss the neural network structure used to ap- 
proximate eigenfunction ¢(a). The neural network architec- 
ture employed in the study is the same as ResNet [60], which 
is built by stacking several residual modules. It is one of the 
most popular models used in Deep Learning, and it is also 
commonly used in the field for solving PDEs via neural net- 
works [24, 61, 62]. Each module has one skip connection, and 
each block consists of two fully connected layers as shown in 
Figure 1. 

In the network, let x, £y € R? be the input and let W; and 
bı, l = 1, 2,3, 4 and 5 be the parameters in the fully connected 
layers. We use W „p and b, and k = 1 and 2 to denote the 
parameters in the residual connections. The results sı and s2 
for the modules can be expressed as follows: 


S1 = 0(W2(o(W, 24 b1)) + b2) + rı(x), (11) 
S2 = o(W4(o(W3s1 + b3)) + b4) + Tə(s1), (12) 


where o denotes the activation function and is chosen as 
tanh. Furthermore, rg, k = 1, and 2 are functions in the 
residual connections, which can represented as follows: 


Tk(Ek) = o(Wrktk + brk). (13) 
Subsequently, the neural network M? (æ) is expressed as 
N? (a) = W582(s1(x)) + bs. (14) 


Therefore, the eigenfunction (x) = N? (æ), where 0 de- 
notes neural network parameters. 


Remark 2 Given that $(a) is the scalar of the neutron pop- 
ulation and denotes the density of the free-moving neutron 
distribution over the spatial domain, we obtain $(x) > 0. 
Therefore, the eigenfunction can be expressed as ¢(x) = 


(N° (ax))?: 


B. Recap of Inverse Power Method Neural Network 


We consider the following linear eigenvalue problem: 
Lo = àQ, in Q, (15) 


which differs from the generalized eigenvalue problem Lọ = 


AQQ. 


Fig. 1. Neural network architecture consists of two modules and 
one linear output layer. Each module contains two fully-connected 
layers and a skip connection. 


Inspired by the idea of the inverse power method, IPMNN 
[49] was proposed to solve for the smallest eigenvalue and 
associated eigenfunction. Equation (16) depicts the key step 
of the inverse power method, and equation (17) is analogous 
to equation (16), where A denotes a matrix, and \,_; and 
@;,—1 denote the results of previous iteration. Therefore, A, 
and @, are obtained by using Eq. (16). 


Pk = Apg, 
_ Pk 
r= Tall (16) 
Àk = (Ady, Px) . 
(Pr; Pr) 
LO 
Sa = pı. (17) 


Here, the neural network N° represents the approximated 
eigenfunction ®; at the kth iteration of Eq. (17). Given that 
®;_;, which is obtained from the last iterative step and fol- 
lows the main idea of inverse power method, we must solve 
©; using Pk = Ltk and Pk = Px /||Px||. However, it 
is difficult to obtain the inverse operator L71 for the differen- 
tial operator £. Therefore, ®;, is obtained without knowing 
the inverse operator. The term £®,; is computed using AD in 


Eq. (17). The main idea is that it is not necessary to calculate 
L-1, and the eigenfunction can be approximated iteratively 
by minimizing the defined loss (18) to approach Eq. (17), 
where x; € S is the dataset, and N denotes the number of 
points in S. The eigenvalue in the kth iteration is obtained by 
using (19). 


(19) 


C. Generalized Inverse Power Method Neural Network 


In standard nuclear engineering procedures, we normally 
employ the inverse power method to solve for the smallest 
eigenvalue and associated eigenvector, which relies on the 
discretization of Equation (3). IPMNN [49] was proposed to 
solve the smallest eigenvalue problem, which is a mesh-free 
method realized by a neural network. However, the method is 
restricted to solving the equation Lo = Ad, which is a sim- 
ple form. Therefore, we propose GIPMNN to solve Eq. (3). 
Details of algorithm of GIPMNN is presented in Algorithm 
1. 

First, the generalized inverse power method is used to solve 
Eq. (20). The key step to focus on is given in Eq. (21), where 
A and B are two matrices and Aķ—1ı and @,_, are the results 
of the previous iteration. Therefore, Ag and @;, are obtained 
by using Eq. (21). 


Ad = \Bo¢. (20) 


Ag, = Av-1BO,-1; 
(Abn, $x) 21) 


Jpm ee 


(Bop, Pk) l 


We use a neural network M? to represent the approximated 
eigenfunction ®. 


LO, = Ax-1 VDP p-1, 
(L®,, Ph) (22) 
(Q®;,, By) 


In GIPMNN, Eq. (22) is an analog of Eq. (21), where 
£ and Q denote linear differential operators implemented 
by AD as opposed to specially discretized matrices. In a 
manner similar to the generalized inverse power method, we 
record the results Ax,_1 of previous iteration. In contrast to 
the generalized inverse power method, instead of recording 
P,_1, we record O®;_,. It should be noted that 6,_, de- 
notes the eigenfunction represented by the neural network in 
(k — 1)th iteration, and Q®;_, is realized by AD. In k-th 


Ae = 


iteration, we directly compute ®; using the neural network, 
i.e., ®, = N®, and calculate L®; using AD. We obtain ®; 
directly through the neural network instead of solving the 
equation LB, = A,_1Q®;,_1, We define the loss function 
LO8S8gipmnn iN Eq. (23) to propel the neural network to learn 
Pk. When the neural network converges, we obtain the small- 
est eigenvalue, and the associated eigenfunction can be ex- 
pressed using a neural network. 


N 
Los8gipmn = Y_|L®q (ai) — An—1Q®x-1(wa)|?. (23) 


i=1 


Remark 3 Jn the algorithm of GIPMNN, given that the initial 
| 


function ®o is not represented by the neural network, it is not 
possible to obtain Qo using the AD. Therefore, we chose an 
arbitrary function Q®o. 


The Neumann and Robin boundary conditions were used 
for the eigenvalue problem. It is difficult to enforce them by 
encoding the boundary conditions into a neural network, as 
in [49, 63, 64], where the Dirichlet and periodic boundary 
conditions are used. 

We form the loss function Loss, in Eq. (8), where No 
denotes the number of sampling points on ðQ, and a; is a 
point in the sampling set fa}. 

Therefore, the loss function is defined in Equations (24) 
and (25) as follows: where the surface indicates that neutrons 
are reflected back into the domain or that the surface allows 
neutrons to escape from the system. 


N» 
1 
> l- 3P(2:)V2(2:) n|? bare surface, (24) 
i=1 
Loss, = N 
~ 1 1 
dol-5 Pei) V O(@:) n— ge reflective surface. (25) 
i=1 


The total loss function (26) denotes the weighted sum of 
the objectives (23) and (8). For the process of GIPMNN, refer 
to Algorithm 1. 


LosStota = ALOS Sgipmnn + BLoOssy. (26) 


Remark 4 In Eq. (26), a and 6 denote the weights of the two 
losses. It is noted that 8 > a when training the neural net- 
work, particularly in the method GIPMNN. Given this issue, 
it is easy for a neural network to determine the eigenvalue 
and eigenfunction. 


D. Physics Constrained Generalized Inverse Power Method 
Neural Network 


Although GIPMNN can solve the eigenvalue problems of 
Eq. (3), it is still difficult to solve eigenvalue problems 
with discontinuous coefficients in different regions. We dis- 
cuss interface problems, which implies that the eigenfunc- 
tion may be continuous at the interface, and the derivatives 
of the eigenfunction may not be continuous at the interface. 
The enforcement of interface conditions is very important for 
GIPMNN. 

In the study, inspired by the idea of a piecewise deep neu- 
ral network [58], we propose a PC-GIPMNN to solve eigen- 
value problems with interface conditions. However, instead 


of employing different neural networks in different subdo- 
mains, we use only one neural network and multiple neurons 
in the output layer, as shown in Fig. 2. It should be noted that 
each neuron in the output layer corresponds to a subdomain. 
Therefore, we can obtain outputs in different subdomains that 
can be used to enforce the conditions at the interface. 
Suppose that there are two domains Q; and Q,, with an in- 
terface I’, which is the cross region between the two domains. 
Given the properties of the neutron population ¢(a), we can 
summarize that the eigenfunction will satisfy two interface 
conditions, i.e., (27) and Eq. (28), where ¢; and ¢, represent 
the eigenfunctions defined in Q; and Q,m, respectively, and 
n denotes the normal vector pointing from Q, to Q. Dı and 
D, are the coefficients defined in Q; and Qy, respectively, 
which are discontinuous at the interface. Equation (27) indi- 
cates that the eigenfunction is continuous at the interface, i.e., 
the neutron flux is continuous at the interface. Equation (28) 
indicates that neutron flow is continuous at the interface. 


{ Qi = br, (27) 
-DVi n = —D,Vop sn. (28) 


Assume that Sp corresponds to the set of points at the in- 
terface [ and |Sr| denotes the number of points in Sp. We 
then introduce two penalty terms to enforce the two interface 
conditions: Equation (29) and Eq. (30), where x; € Sr. 


Algorithm 1: GIPMNN for Finding the Smallest Eigenvalue 


Given that N is the number of points for training the neural network, N, denotes the number of points on the boundary OQ, Nepoch 
denotes the maximum number of epochs, Ao denotes the initial guess of the eigenvalue, and the stopping criterion e€. 

Step 1: Build data set S for loss function LosSgipmmn and data set Sp for loss function Loss». 

Step 2: Initialize a neural network with random initialization of parameters. 


Step 4: Given an arbitrary function O®o. 

for k = 1, 2 ERS » Nepoch do 

Input all points in S and S, into neural network N°. 
Let ®;,(a;) = N? (xi), where æ; € SU) Sb; 
LosSgipmnn = EN |LOx (xi) = Ak-1 Qp (x)|. 
Lossy = 0 j2°\|B®:(wi) — g(as))?, 

LosStota = ALOSSgipmm + Lossy, 


~ <b, PRS f ; 
Update parameters 0 of the neural network via gradient descent. 


if Losstorai < € then 

Record the best eigenvalue and eigenfunction. 

The stopping criterion is met, the iteration can be stopped. 
end 


end 


Loss; 


Lossj2 


By combining (26), (29), and (30), Equation (31) is the 
total loss defined in our method PC-GIPMNN, where a, 8, 7, 
and 6 are the weights of the different losses. In subsequent 
experiments, we chose 1. In the study, we focused on the 


Losstotat = ALOSSgipmnn + BP Lossy + yLoss;, + 6Lossiz. 


Moreover, the eigenfunction can be represented as: 


to) = lidi + lpdr, 


where, lı and l, denote the indicator functions, l = 1 in Qu, 
l =O0inQ,, lp = 1 in Q, andl, = 0 in Q. 


Remark 5 Conservative PINN (cPINN) [53] developed 
PINN for each subdomain and considered additional con- 
servation law along the subdomains’ interfaces (a general 
consideration in reactor physics [11]). However, in neural 
network training, eigenvalue is involved as a variable to be 
optimized, and the numerical examples that are presented 
correspond to only one-dimensional cases. Furthermore, the 
relative errors of keg in these cases are 4.4800 x 1074 and 
3.3500 x 1074. Similarly, we use Eqs. (29) and (30) to en- 
force the interface conditions. As shown below, our methods 
are more generic and yield better results. 


(32) 


Ok+1 = Ok — NVoLosstota(Ox), where n is the learning rate and 6; is the vector of parameters in k-th iteration. 


(29) 


(30) 


proposed algorithms and neglected the influence of weights. 
We are expecting that our proposed algorithms are universal 
and achieve better results, even without adjusting the weights. 
Therefore, we select all weights as 1. 


(31) 


Remark 6 In [45], the impact of the interface conditions was 
ignored, and the relative errors of key in their cases corre- 
sponded to 1.3 x 10-3 and 4.4 x 1073, respectively, and 
the study did not involve the smallest eigenvalue problem. In 
this study, we obtain the lowest eigenvalue using the inverse 
power method as opposed to using a free learnable param- 
eter to approximate the eigenvalue. Our numerical results 
demonstrate that accurate results can be obtained in more 
complicated cases. 


E. Deep Ritz Method 


DRM is a deep-learning-based method for numerically 
solving variational problems [24]. It reformulates the orig- 
inal PDEs into equivalent variational equations and defines 
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Fig. 2. Illustration of PC-GIPMNN architecture diagram. There are multiple neurons in the output layer, which denote the eigenfunctions in 


different subdomains. 


the loss function based on variational formulations. The so- 
lutions of PDEs are represented by a neural network, and the 
derivatives are calculated using AD. DRM [24] is also used 
to solve eigenvalue problems. Furthermore, we specify how 
to use DRM to solve Eq. (3). 

We consider the variational principle of the smallest eigen- 
values. 


Jo Lo- odx 
Jo Q4: pdz’ (33) 
s.t. Bolan =4Q9, 


min 


where Rayleigh quotient was used. The boundary conditions 
were enforced by adding a penalty term. 


min f |B¢ — g|*ds, (34) 
dQ 
and the total loss function LoSSarm is defined as: 


a Jo Lo: odx 


Lossam = 


+ ef IBọ -— g|°ds, (35) 
ôn 


where œ and 8 denote the weights of different losses. We 
chose a = 1 and 8 = 1 for our experiments. 

After the optimal approximation is obtained by solving the 
optimization problem (35), we obtain the smallest eigenvalue 


A= ce and eigenfunction represented by the trained 
Q 


neural network. It should be noted that Lo, O¢, and Bọ are 
computed using the AD. For the process of DRM, refer to 
Algorithm 2. 


Remark 7 We enforced the boundary condition by adding a 
penalty term, (34). However, if the boundary condition is the 
Neumann or Robin boundary condition, we do not use the 
penalty term (34) because the boundary condition is incorpo- 
rated into the Rayleigh quotient based on Green’s first iden- 
tity [65]. 


IV. EXPERIMENTS 


In this section, we present numerical experiments to 
compare the applicability and accuracy of GIPMNN, PC- 
GIPMNN, and DRM for solving the smallest eigenvalue 
problems in reactor physics. In all the experiments below, we 
chose the Adam optimizer with an initial learning rate 107° 
to minimize the loss function. Furthermore, we trained the 
neural network with the architecture of ResNet on a server 
equipped with CentOS 7 system, one Intel Xeon Platinum 
8358 2.60-GHz CPU, and one NVIDIA A100 80GB GPU. 
Moreover, unless otherwise specified, the activation function 
was selected as the tanh function. 


A. One-Dimensional Slab Reactor 


We consider one-dimensional case with a simple slab reac- 
tor consisting of a domain bounded by vacuum or bare sur- 
faces, as shown in Figure 3. The length of each slab reactor 
was 10 cm. It consists of fuel and control rod regions labeled 
1, 2, and 3. The control rods were located between 2.2 and 
2.5 cm and between 7.5 and 7.8 cm on the x-axis. 

As shown in Fig. 3, there were two control rods in the one- 
dimensional slab reactor. Either device can be withdrawn or 
inserted. Three scenarios were designed to model the reac- 
tor and completely simulate the actions of the control rods. 
The three situations are labeled Fl, F2, and F3 in Table 1. 
They indicate that both rods are withdrawn, only the left rod 
is inserted, and only the right rod is inserted. We also consid- 
ered three other problems, R1, R2, and R3 in Table 1. Prob- 
lem R1 is the same as F1, which is designed to simulate the 
withdrawal of both control rods. Here, we use R1 to facili- 
tate comparison with R2 and R3. Although problems R2 and 
R3 denote that all the control rods are inserted, problem R2 
resembles heavily inserted control rods, and problem R3 re- 
sembles slightly inserted control rods. 

To generate the necessary data to validate the accuracy of 
GIPMNN, PC-GIPMNN, and DRM, FreeFEM [67-73] is uti- 
lized to solve these problems in Table 1. FreeFEM is a partial 
differential equation solver for non-linear multi-physics sys- 


Algorithm 2: DRM for Finding the Smallest Eigenvalue 


Given N denotes the number of points for training the neural network, N, denotes the number of points on the boundary 0Q, Length 
denotes a measure of OQ, Nepocn denotes the maximum number of epochs, and e denotes the stopping criterion 

Step 1: Build data set S for the loss of the Rayleigh quotient and data set S, for the loss of boundary. 

Step 2: Initialize a neural network with random initialization of parameters. 


for k = 1, 2; ae , Nepoch do 
Input all points in S and S, into neural network No. 
Let P(x) = N? (æ), where x; € SU So; 


<QPk Pp > N, 


if Lossam < € then 
Record the best eigenvalue and eigenfunction. 


end 
end 
Region 1 
2.5 cm 
2.2 cm 
0 1 2 3 4 5 
Region 2 


Update parameters 0 of the neural network via gradient descent. 
Ok+1 = Ok — NVoJ (Ox), where 7 is the learning rate and 0x is the vector of parameters in k-th iteration. 


Lossarm =a <LO,,P,> 4: Bb Eensth pees |B®, (x) _ g(x:)|?. 


If the stopping criterion is met, then the iteration can be stopped. 


2.5 cm 
2.2 cm 
6 7 8 9 10} x axis (cm) 
Region 3 


Fig. 3. Macroscopic cross-sections for the 1D slab reactor, which is modified based on the figure reported in [66]. There are three regions 


with different materials and functions. 


Table 1. Six sets of material cross-sections in the work [66] are used 
to test GIPMNN and DRM. 


Problem Region S*(cm~') S*(em7') vd‘ (em7) 


Region! 0.4 2.0 0.5 
F1 Region2 0.4 2.0 0.5 
Region3 0.4 2.0 0.5 
Region! 0.4 2.0 0.5 
F2 Region2 0.6 2.0 0.3 
Region3 0.4 2.0 0.5 
Region! 0.4 2.0 0.5 
F3 Region2 0.4 2.0 0.5 
Region3 0.6 2.0 0.3 
Region! 0.4 2.0 0.5 
R1 Region2 0.4 2.0 0.5 
Region3 0.4 2.0 0.5 
Region! 0.4 2.0 0.5 
R2 Region2 0.6 2.0 0.3 
Region3 0.6 2.0 0.3 
Region! 0.4 2.0 0.7 
R3 Region2 0.5 2.0 0.4 
Region3 0.5 2.0 0.4 


tems using the finite element method. We choose number of 
cells in the z-direction N = 1001 and mesh size Ag = 1078. 
The solution of the baseline is obtained by FEM, and uniform 
cells are used to train GIPMNN, PC-GIPMNN, and DRM. 


Remark 8 When solving all the parameter-dependent prob- 
lems below, parameters i", X°, and ud! are selected based 
on whether the data point x belongs to a related region. For 
PC-GIPMNN, the data points on the interface are considered 
to belong to multiple regions simultaneously. Therefore, we 
can enforce the interface conditions using data points on the 
interface. 


Remark 9 As unsupervised algorithms, our methods use 
only points to train a neural network without any prior knowl- 
edge. Therefore, there was no test set for the proposed algo- 
rithm. 


1. Using GIPMNN to Solve for Eigenvalue 


In this one-dimensional example, we select 20 neurons for 
each hidden layer in ResNet and Nepoch = 50000. Without a 
loss of generality, the weights a and £ of the different losses 
are not adjusted to achieve better results. Therefore, a and 3 
were set to one. 

In Eq. (21), we must solve for ;, using the eigenvalues 
Ax—1 and @;,_,. To accelerate the training process and obtain 
more accurate results using Algorithm 1, we split the itera- 
tions in the original Algorithm 1 into inner and outer itera- 
tions, which can be observed in Algorithm 3, where Ninner 


10 


and Nouter denote the number of inner and outer iterations, was fixed at Nota = 100000 and Nota = Ninner X Nouter- 
respectively. The relative errors of Aer and eigenfunction for different ra- 
We chose the ratios of the outer and inner iterations as 1 : 1, tios of outer and inner iterations during the training process 


1: 10, 1 : 100, 1 : 1000, and 1 : 10000 to investigate the ef- are shown in Figure 4. The results of kal are shown in the 
fect of the ratio on the results. Ratio 1 : n implies that the first row and those of ¢*! are shown in the second row, where 
outer code is executed once, whereas the inner code is exe- the relative errors of kett and eigenfunction are calculated by 
cuted n times. For comparison, the total number of iterations 


Ker( FEM) — ketr(NN)| 


rel __ 
ket = kal FEM) i n 
,  max(\d(FEM) — o(NN))) 
gre = x ? (37) 


maz([6(FEM)]) 


respectively. As shown in the figures, the best ratio is 1 : 1 the inverse power method. 
because the relative errors of keg and eigenfunction of the 
ratio 1 : 1 is relatively smaller than the others when training 


the neural network. The convergence worsens when the ratio 2. Using PC-GIPMNN to Solve for Eigenvalue 

of the outer and inner iterations changes from 1 : 1 to 1 : 

10000. Therefore, we trained GIPMNN using a ratio 1 : 1 to We divided the slab reactor into five parts from left to right. 
solve 1D and 2D problems. The output layer included five neurons, and five functions 


Moreover, we fixed the number of outer iterations were defined in different subdomains. Here, u, w, and q 
Nouter = 1000 and retrained the neural network using the denote the functions defined in Region 1, v and p are those 
ratios. The results are shown in Fig. 5. The opposite results defined in Regions 2 and 3, respectively. 
are observed when compared with the results in Fig. 4. The As shown in Figure 3, the four points are denoted as 
ratio 1 : 1 is the worst ratio because increases in inner iter- £i1=2.2, £i2=2.5, 1343 = 7.5, and z;4 = 7.8. The interface 
ations lead to a better approximation of the eigenfunction in conditions (29) and (30) are implemented as loss functions 
the next outer iteration. This is consistent with the results of defined in (38) and (39) in the 1D example. 


Lossi = |u(aa) — v(zn)|? + lv(zi2)— w(zi2)|? + |w(ais) — plzi3)l? + |p(aea) — a(z)’, (38) 
Lossig = |(—Diuz(#i1)) — (—Dove(air))|? + |(—Dov2(xi2)) — (—Diwe2(xi2))/? 
+|(—Diwe(#is)) — (—Dspe(wis))|? + |(—Dspe(wia)) — (—Dige(wia))|?. (39) 
The eigenfunction can be expressed as follows: 3. Using DRM to Solve for Eigenvalue 
$ = ul; + vlz + wlz + pla + als, (40) The configuration of DRM is identical to that of GIPMNN. 


Given that it is a variational formulation, and the boundary 

condition is incorporated into the Rayleigh quotient, we do 
where l1, l2, 13, l4, and ls are indicator functions that are 1 or not use the penalty term to enforce the boundary condition. 
0 based on the input x inside or outside the subdomain. The loss function Loss¢ym is defined as: 


Lengths SN | DIV o(ai)|? + 4 (P(ar)? + o(ar)?) + Lert ON | NG (ai)? 


Lossam = Baath WV ; (41) 
. “4 se ini ULF o(a)? 
l 
where Lengths» denotes the length of the slab reactor, 2; of the slab reactor, and N denotes the number of points used 


and x, denote points that show the positions of the endpoints to approximate the integral. Therefore, the smallest eigen- 
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Fig. 4. (Color online) Relative error of ker and @ for problems F1, F2, and F3 (from (a) to (c)) with respect to different ratios of outer and 
inner iterations during training process. The first row shows the results of kt% and the second row shows the results of $"'. The neural network 


is trained with Nota = 100000. 
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Fig. 5. (Color online) Relative error of ker and @ for problems F1, F2, and F3 (from (a) to (c)) with respect to different ratios of outer and 
inner iterations during training process. The first row shows the results of kt and second row shows the results of ¢'. The neural network is 


trained with Nouter = 1000. 
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Algorithm 3: Iterations in Algorithm 1 are split into inner iterations and outer iterations. 


for k = 1.2; 2r , Nouter do 
for j = 1,2,--- , Ninner do 
Input all points in S and Sp into neural network N°. 
Let Pp (x1) =.NV°(ai), where x; € SU Sp; 
Lossgipmnn = Dyjn1|LOx (ai) — Ak-1 Qr- (wi) |?. 
Loss» = Z, |B®, (ai) — g(2:)}P, 
LO8S8total = ALOSSgipmnn + BLossp, 
if 7 = Ninner then 

Pr-1 = Pz/ ||P]. 

d = <(L®);,, Pp, > 

k=1 <(QP)k Pk >? 

end 


(kj)-th iteration. 

end 

if LOSStotal < € then 

Record the best eigenvalue and eigenfunction. 


end 
end 


value À can be obtained after the neural network converges. 


4. Results 


The results for the one-dimensional example are listed in 
Tables 2, 3, and 6. The values of Kege and their relative errors 
are listed in Table 2. For problems F1 and R1, which have 
continuous coefficients, the results obtained by GIPMNN are 
better than those obtained by PC-GIPMNN and DRM. For 
problem F2, the relative error of keg obtained by DRM is bet- 
ter than those obtained by other methods. For other problems 
with discontinuous coefficients, the results obtained using the 
PC-GIPMNN are better than those obtained by GIPMNN and 
DRM. The relative error of keg computed by PC-GIPMNN is 
approximately 1075, which is lower than 10~4 as computed 
by DRM. The relative errors in the eigenfunctions are listed 
in Table 3. For all problems, PC-GIPMNN can attain better 
results than GIPMNN and DRM. 

In Fig. 6, the eigenfunctions of GIPMNN, PC-GIPMNN, 
and DRM are compared with those of FEM. Specifically, we 
follow the conventional normalization process in nuclear re- 
actor physics [1], where the normalization constant is gen- 
erally computed to make the average reactor power equal to 
unity; thus, the eigenfunctions are normalized by Eq. (42), 
where N denotes the number of training points in the entire 
domain. 


oe 
ae b(xi) 


The figures in the first row show the results for problems F1, 
F2, and F3 and those in the second row show the results for 
problems R1, R2, and R3. In each figure, we plot the eigen- 
function obtained by FEM and compare the relative errors 
of the eigenfunctions computed by GIPMNN, PC-GIPMNN, 


@. (42) 


norm = 


Update parameters 0 of the neural network via gradient descent. 
Ok j+1 = Oki — NV eo LosSStotal (Ok, j), where 7 denotes the learning rate and 0x,; denotes the vector of parameters in 


If the stopping criterion is met, then the iteration can be stopped. 


and DRM. Evidently, the results obtained by PC-GIMPNN 
are better than those obtained by the other methods, which is 
consistent with the results in Table 3. 


As reported in a previous study, [49], IPMNN can attain 
more accurate eigenvalues when compared to those obtained 
with DRM when linear eigenvalue problems without inter- 
face are considered. Therefore, IPMNN and GIPMNN are 
suitable to determine the eigenfunction of a problem with a 
strong form, and DRM is similar to FEM in that DRM is 
applicable for finding the eigenfunction of a problem with 
a weak form. Consequently, DRM is better than GIPMNN 
in this one-dimensional example with discontinuous coef- 
ficients. However, given that the interface conditions are 
well implemented in PC-GIPMNN, it successfully learns the 
eigenvalue and eigenfunction and achieves better results than 
GIPMNN and DRM, as shown in Tables 2 and 3. 


Remark 10 Specifically, DRM is applicable to determine the 
eigenfunction of a problem with a weak form, which implies 
that the eigenfunction exhibits low regularity. Subsequently, 
as shown in the implementation of GIPMNN, it is necessary 
for the eigenfunction obtained from GIPMNN to exhibit high 
regularity. Therefore, the learned eigenfunction is in a strong 
form. Hence, GIPMNN is unable to obtain accurate values 
at the interface. However, PC-GIPMNN does not require the 
eigenfunction to exhibit a higher regularity at the interface 
but instead guarantees continuity and physical constraints by 
realizing the interface conditions. Therefore, PC-GIPMNN 
successfully learns the eigenvalue and eigenfunction and ob- 
tains better results when compared to GIPMNN and DRM. 
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Table 2. Results obtained by GIPMNN, PC-GIPMNN, and DRM compared with those obtained by FEM for problems in Table 1. ki} denotes 


the relative error of Kegr. 


Problem kegerem) Ketpcipmnn) Kepeiec-ciemnn) KeffORM) klarem 


kiSlec-orMmNN) kilorm 


FI 1.2127 1.2127 1.2127 1.2128 2.5142 x 10 © 3.7743 x 10 © 5.8568 x 10° 
F2 1.1973 1.1970 1.1973 1.1972 2.3429 x 1074 2.6246 x 107^ 1.1679 x 1074 
F3 1.1973 1.1972 1.1973 1.1971 5.5503 x 107° 1.6898 x 107° 1.6809 x 1074 
RI 1.2127 1.2127 1.2127 1.2128 2.5142 x 10`" 3.7743 x 10-° 5.8568 x 10°" 
R2 1.1715 1.1697 1.1715 1.1707 1.5629 x 107? 3.2020 x 10~° 7.2244 x 107 
R3 1.6498 1.6484 1.6498 1.6487 8.5897 x 1074 2.9838 x 107° 6.7968 x 107+ 


Table 3. Results obtained by GIPMNN, PC-GIPMNN, and DRM compared with those obtained by FEM for problems in Table 1. ¢"' denotes 


the relative error of eigenfunction. 


Problem "crum rel PC-GIPMNN) rel brm 

Fl 2.7301 x 10-* 6.3620 x 10~* 2.6313 x 10-7 
F2 2.6279 x 107? 1.6140 x 107? 8.2217 x 1073 
F3 2.7403 x 107? 1.0120 x 107? 6.2428 x 1073 
RI 2.7301 x 10`? 6.3620 x 1077 2.6313 x 1077 
R2 2.5768 x 107? 7.5946 x 1074 6.8807 x 107? 
R3 1.4559 x 107? 7.5257 x 107^ 8.1038 x 107 


B. Two-Dimensional Reactor 


As shown in Figure 7, a two-dimensional reactor is mod- 
eled in a square-shaped domain with 90-cm sides. The reactor 
was surrounded by a neutron reflector with graphite material, 
which implies that Robin boundary condition was selected. 
The main bulk of the reactor corresponds to the fuel region. 
Within its central region, four control rods can be inserted or 
withdrawn. When the control rods are withdrawn, materials 
in the regions are replaced with water, which corresponds to 
common practice in many reactor designs. Table 4 lists five 
different materials used in the mock reactor. However, the 
two types of fuels in Table 4 are designed to simulate differ- 
ent fuel materials. Fuel type 1 defines the standard fuel used 
in most problems, and fuel type 2 defines an adjusted fuel 
composition that is different from fuel type 1. Fuel type 2 
in problem R7 was used to test whether our methods can be 
affected by different types of fuels. 


As shown in Table 5, there are 12 problems in validating 
the accuracy of the proposed method. Five full models which 
are labeled as FI—F5, were proposed to simulate the reactor 
with all control rods removed and then with only one control 
rod inserted in the regions of the control rods. As previously 
discussed, when the control rod was removed, the material 
in the region was replaced with water. Therefore, W is used 
to denote water in Table 5. Other seven reactor configura- 
tions, denoted by R1-R7, were proposed to simulate the cases 
where more control rods were inserted. Problems R1 and R2 
are equivalent to the full model problems F1 and F2: Prob- 
lems R3-R6 utilized different combinations of inserted and 
rejected control rods. It should be noted that in problem R7, 


the material configuration differs from the material configu- 
ration of other problems. The fuel type was replaced with 
fuel type 2, and the control rods were assumed to be partially 
inserted, which implies that the materials in the regions cor- 
responded to a mix of control rods and water materials. 

In the two-dimensional case, we use FreeFEM to solve 
the problems listed in Table 5. We chose uniform grids 
with Az = 5 and Ay = +. We trained GIPMNN, PC- 


90 
GIPMNN, and DRM with N, = 91 and N, = 91. 


1. Using GIPMNN and PC-GIMPNN to Solve for Eigenvalue 


The number of points N = N,Ny is used to calculate 
LosSgipmnn and number of points N, = 2(N, — 2) + 2(.N, — 
2) + 4 is used to calculate Loss). The number of neurons 
is 20 for each hidden layer in ResNet and Nepoch = 500000 
for both GIPMNN and PC-GIPMNN. Without a loss of gen- 
erality, œ and ( are set to one. As mentioned previously, we 
use the optimal ratio of the outer and inner iterations, which 
is1:1. 

The 2D reactor is divided into six parts, as shown in Fig- 
ure 7. The output layer includes six neurons, and there are 
six functions that are defined in different subdomains and are 
labeled as u, v, w, r, p, and q, where u, v, w, and r denote 
functions defined in CR1, CR2, CR3, and CR4, and p and q 
denote functions defined for Fuel and Graphite, respectively. 
Scri, SCR2, Sor3, Scr, and Sqr denote different datasets 
at different interfaces. 

For PC-GIPMNN, interface conditions (29) and (30) are 
implemented as loss functions (43) and (44) in the 2D exam- 


ple. 
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Fig. 6. (Color online) Eigenfunctions obtained by GIPMNN, PC-GIPMNN, and DRM are compared with those obtained by FEM for one- 
dimensional example. The eigenfunctions are normalized. The first row shows the results of problems F1, F2, and F3, and the second row 


shows the results of problems R1, R2, and R3. 
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Fig. 7. Geometry of a 2D reactor core with graphite, fuel, and four control-rod regions labeled as 1, 2, 3, and 4. The figure is similar to the 


figure in a previous study [66]. 
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Table 4. Cross section data for various materials of the 2D reactor. This data are similar to those reported in a previous study [66]. 


Material S*(cm~') S8(cm~') v¥/(cm7!) 
Fuel 1 0.075 0.53 0.079 

Fuel 2 0.072 0.53 0.085 
Water 0.01 0.89 0.0 

Control rod 0.38 0.2 0.0 
Graphite 0.15 0.5 0.0 


Table 5. Configurations for reactor problems with different materials. The configurations are similar to those in a previous study [66]. C-R 
denotes the control rod region, CR denotes a control rod material, and W denotes water. 


Problem Fuel type C-R1 C-R2 C-R3 C-R4 
F1 Fuell W W W W 
F2 Fuell CR W WwW W 
F3 Fuell W CR Ww W 
F4 Fuell W W CR W 
F5 Fuell W W W CR 
R1 Fuell W W W W 
R2 Fuell CR W W W 
R3 Fuell W CR CR Ww 
R4 Fuell W W CR CR 
R5 Fuell CR CR W CR 
R6 Fuell CR CR CR CR 
R7 Fuel2  $CR+3W =CR+3W 3CR+5W =CR+2W 
|Scril |Scra| |Scrs| 
Lossa = >> lulz) -pled + D> l(a) — ple S lwla: — pla) 
i=l i=l i=l 
|Sora| \Scr| 
+ D æ) -peP + Y ples) - ae, (43) 
i=1 i=1 
[Scr] [Sc r2] 
= 2 2 
Lossi XO \(—DeriVu(@i) +n) — (—DruetVp(@i) n)? + XO |(—Dor2Vo(@i) n) — (—DruetVp(@i) - n)| 
i=l i=1 
|Scrs| |Scra| 
T 5 |(—Dor3Vw(xi) ` n) = (—DFrueiVp(zi) n)|? + 5 |(—DoraVr (xi) n) _ (—DrueiVp(x:) n)|? 
i=1 i=1 
[Scr] 


i=l 


T 5 |(—DruetV p(xi) j n) = (—DerapniteV (Xi) e n)|?. 


The eigenfunction is expressed as follows: 


$ = ul; + vla + wl3 + rly + pls + ale, (45) 


where 11, l2, l3, l4, l5 and le denote indicator functions. 


Remark 11 Jn the experiment, it was important to use Eq. 
(42) instead of the original L2 norm, which is too small to 
optimize the neural network. Specifically, the L2 norm of the 


LOSSdrm = 


A square “square 
Areta SON | DIV O(ai)|? + gme ON DG (ai)? 


eigenfunction corresponds to one if we attempt to find a nor- 
malized eigenfunction. If the total number of points N is ex- 
cessively high, then the value of each component of the eigen- 
vector is excessively low to such an extent that it is difficult for 
the neural network to learn the eigenfunction. 
2. Using DRM to Solve for Eigenvalue and the failure of DRM 
during the 2D Experiments 


Given that the homogeneous Neumann boundary condition 
is used, the loss function in DRM can be defined as: 


Ar eQgquare N 
eae yt vif olz)? 


, (46) 


where Aredsquare denotes the area of the square domain as 
shown in Fig. 7. 

We use the number of epochs Nepoch = 500000 in DRM. 
First, we found that the DRM can learn the eigenvalues and 
eigenfunctions at an early stage of the training process. Sub- 
sequently, the DRM attempts to learn a smaller eigenvalue 
after it is close to the true eigenvalue. Finally, the DRM 
successfully learns one smaller eigenvalue that may be close 
to the true eigenvalue and one terrible eigenfunction that is 
meaningless and far from the true eigenfunction. This phe- 
nomenon is illustrated in Fig. 8. 

As listed in Table 6, although the neural network in DRM 
fails to learn the eigenfunction, the eigenvalue is close to the 
true value. This phenomenon may be caused by minimiza- 
tion of Rayleigh quotient. As mentioned in [49], the loss ap- 
proaches zero, whereas the eigenvalue may not reach zero. 

In Figure 8, we observe that the failure of DRM during 
the 2D experiments occurs after Nepoch > 60000. Therefore, 
we select Nepoch = 50000 to train DRM again and record the 
results. In a previous study [40], the stopping criteria of train- 
ing process for PINN was investigated. In future studies, we 
will follow the technique discussed in [40] to determine the 
stopping criteria for DRM. In the next section, we compare 
the results of the DRM trained with Nepoch = 50000 with the 
results of GIPMNN and PC-GIPMNN. 


Table 6. Failure of DRM during the 2D experiments. The eigen- 
function of F1-F5 and R1-R7 is not correct, but the eigenvalue is to 
the true value. 


Problem kertem Kerrey kelorm rel ORM) 
Fl 1.0118 1.0423 3.0156 x 10-7 16.6580 
F2 1.0052 1.0399 3.4538 x 107? 21.8882 
F3 1.0000 1.0428 4.2782 x 107? 29.5368 
F4 1.0079 1.0533 4.5361 x 107? 26.4549 
F5 1.0052 1.0450 3.9598 x 107°? 32.0516 
R1 1.0118 1.0423 3.0156 x 10-7 16.6580 
R2 1.0052 1.0399 3.4538 x 107? 21.8882 
R3 0.9921 1.0399 4.8250 x 107? 166.2996 
R4 1.0017 1.0397 3.7884 x 107? 163.6593 


R5 0.9780 1.0498 7.3430 x 107? 182.7224 
R6 0.9668 1.0492 8.5210 x 107? 87.2200 
R7 1.1018 1.1517 4.5337 x 107° 36.6777 


Remark 12 The numerical results in Fig. 8 are not due to 
hardware problems or code errors but can be attributed to 
the nature of the neural network. The eigenvalue was ap- 
proximated by constructing a Rayleigh quotient in the DRM. 
Then, the eigenvalue is treated as a loss function to optimize 
the neural network. However, this mechanism of minimizing 
the eigenvalue leads to overfitting of the neural network, as 
the neural network always attempts to find a point where the 
loss function tends toward zero. 


3. Results 


Similar to the results of the one-dimensional case, the rel- 
ative errors of ke and eigenfunction ¢ are shown in Tables 7 
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and 8. It can be observed that both the relative errors for Kegr 
obtained via all the methods are small, and the results for the 
DRM are trained with Nepoch = 50000. 


For problems F1, F2, F4, and R7, the relative error of 
kere obtained by DRM was smaller than that obtained by 
GIPMNN, except for problems F3, F5, R3, R4, R5, and 
R6. However, although the relative error of kee obtained 
by GIPMNN is small, the relative error of ¢ simulated by 
GIPMNN is larger than that obtained by DRM. It is observed 
that k! obtained by the PC-GIPMNN is smaller than that ob- 
tained by GIPMNN and DRM for all problems. Furthermore, 
the relative errors of ¢ computed by the PC-GIMPNN are 
smaller than those obtained by the GIPMNN and DRM for 
all problems. Therefore, the PC-GIPMNN can successfully 
learn eigenvalues and eigenfunctions. 


In Fig. 9 and 10, the eigenfunctions computed by the FEM 
are shown in the first column, and the relative errors of the 
eigenfunctions obtained by the GIPMNN, PC-GIPMNN, and 
DRM are shown in the other columns for different problems. 
It is observed that the relative errors of the eigenfunction com- 
puted by the PC-GIPMNN and DRM are smaller than those 
obtained by the GIPMNN, which failed to learn some details. 
Compared to the eigenfunctions computed by the FEM, the 
results obtained by the PC-GIPMNN are the best among the 
three methods. 


C. 2D IAEA Benchmark Problem 


We also considered the classical 2D IAEA benchmark 
problem reported in the study by Yang et al. [40], which 
was modeled using two-dimensional and two-group diffusion 
equations. Here, one-group neutron diffusion equation, de- 
fined in Eq. (4)m is considered, and multigroup problems are 
considered in our future study. Its geometry is shown in Fig. 
11. The main bulk of the reactor consists of two fuel regions, 
labeled 1 and 2, representing the two types of fuel materials. 
Within its central region, there are four control rods, which 
are all labeled as 3. The last region, labeled 4, was composed 
of water. Cross-sectional data for the 2D IAEA benchmark 
problem are presented in Table 9. It is worth noting that only 
one quarter of the reactor is shown in this figure because the 
rest can be inferred by symmetry along the x- and y-axes. 
Therefore, this 2D IAEA benchmark problem is confined to 
the two types of boundary conditions defined in Eq. (7). The 
problem is confined to the Neumann boundary condition on 
the x- and y-axes and to the Robin boundary condition on the 
other boundaries. 


In this two-dimensional case, we used FreeFEM to solve 
the 2D IAEA benchmark problem with the parameters listed 
in Table 9. We selected uniform grids with Ax = To and 


Ay = To: We trained GIPMNN, PC-GIPMNN and DRM 


with N, = 171 and N} = 171. 
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Fig. 8. (Color online) Failure of DRM during the 2D experiments. DRM fails to learn the eigenfunctions of F1-F5 and R3-R7. 


vw" Table 7. Results obtained via GIPMNN, PC-GIPMNN, and DRM compared with FEM for problems in Table 5, ki: denotes the relative error 
= of kett. Especially, DRM is trained with Nepoch = 50000. 


Problem Kepgrem) Ketpcipmnn) Kepeiec-ciemnny KeffORM) kee omy) Ie ec.cIpMNN) kl orm 

F1 1.0118 1.0094 1.0121 1.0096 2.3574 x 10? 2.8227 x 10-7 2.1558 x 1073 
F2 1.0052 1.0023 1.0055 1.0024 2.8782 x 107 2.9864 x 1074 2.7674 x 107° 
F3 1.0000 0.9968 0.9999 0.9965 3.2380 x 107? 1.2475 x 1074 3.4253 x 107% 
F4 1.0079 1.0053 1.0081 1.0054 2.5913 x 107 2.4834 x 1074 2.4439 x 107 
F5 1.0052 1.0025 1.0054 1.0024 2.7120 x 107° 3.1435 x 1074 2.7394 x 107 
RI 1.0118 1.0094 1.0121 1.0096 2.3574 x 10`? 2.8227 x 10-7 2.1558 x 1077 
R2 1.0052 1.0023 1.0055 1.0024 2.8782 x 107 2.9864 x 1074 2.7674 x 107 
R3 0.9921 0.9937 0.9927 0.9878 1.5981 x 107? 6.0649 x 1074 4.3297 x 107° 
R4 1.0017 1.0014 1.0015 0.9987 3.5502 x 1074 2.4851 x 1074 3.0409 x 107° 


R5 0.9780 0.9759 0.9781 0.9721 2.1223 x 107 1.6160 x 1074 6.0148 x 107° 
R6 0.9668 0.9639 0.9680 0.9584 3.0188 x 107? 1.2722 x 107° 8.6348 x 107° 
R7 1.1018 1.0929 1.1020 1.0953 8.0309 x 107 2.1827 x 1074 5.9042 x 107° 


i Table 8. Results obtained via GIPMNN, PC-GIPMNN, and DRM when compared with FEM for problems in Table 5, ga denotes the relative 
- = error of the eigenfunction. Especially, DRM is trained with Nepoch = 50000. 


Problem "ceux G! ec-cirmnn) orm 

QO Fl 4.6050 x 107? 3.0937 x 107?” 4.4186 x 1077 
F2 8.9731 x 107? 3.6713 x 107? 7.0763 x 107? 
F3 8.9385 x 107? 4.1200 x 107? 6.7164 x 107? 
F4 8.5110 x 107? 4.3430 x 107? 5.9907 x 107? 
F5 7.9413 x 107? 4.6256 x 107? 6.6049 x 107? 
RI 4.6050 x 10-7 3.0937 x 10-7 4.4186 x 1077 
R2 8.9731 x 107? 3.6713 x 107? 7.0763 x 107? 
R3 1.1047 x 107+ 6.2990 x 107? 8.4720 x 107? 


R4 1.3013 x 107! 2.9511 x 107? 6.8647 x 107? 
R5 2.5852 x 107} 2.9718 x 107? 7.7923 x 107? 
R6 4.6253 x 107! 5.1511 x 107? 9.1424 x 107? 
R7 1.1612 x 107! 1.8635 x 107? 7.2148 x 107? 


Table 9. Cross section data for the 2D IAEA benchmark problem. 
Region ©@(cm~+) 58(cm7+) vf (cm7!) Material 


1 2.0 0.53 0.079 Fuel 1 
2 0.087 0.55 0.085 Fuel 2 
3 0.38 0.20 0.0 Control rod 
4 0.01 0.89 0.0 Water 
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Fig. 9. (Color online) First column shows the heatmap of the eigenfunction of FEM (the first column) and the other columns show the 
heatmaps of the relative error of GIPMNN (the second column), PC-GIPMNN (the third column), and DRM (the fourth column) for problems 
F1, F2, F3, F4, and F5. Evidently, GIPMNN less favorable results than DRM. However, by enforcing the interface conditions, PC-GIPMNN 


outperforms GIPMNN and DRM, as shown in the third column. 


1. Using GIPMNN and PC-GIMPNN to Solve for Eigenvalue 


The number of points N = N,N, is used to calcu- 
late LosSgipmnn. The number of points Ny» on the z- and 
y-axes and number of points Np» on the other boundaries 
were used to calculate Lossy and Losspp, which enforced 
the Neumann and Robin boundary conditions. The num- 
ber of neurons was 20 for each hidden layer in ResNet and 
Nepoch = 500000 for GIPMNN and PC-GIPMNN. As men- 


tioned previously, we used the optimal ratio of the outer and 
inner iterations, which was 1 : 1. 


The 2D reactor is divided into seven parts, as shown in Fig. 
11. The output layer has seven neurons, and seven functions 
are defined in different subdomains and labeled as u, v, w, r, 
p, q, and h, where u, v, w, and r denote the functions defined 
in the control rods and p, q, and h are the functions defined 
in the fuel and water regions, respectively. Sup, Sup, Swp» 
Srp, Srqs Spq> and Sqn denote different datasets at different 
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Fig. 10. (Color online) First column shows the heatmap of the eigenfunction of FEM (the first column) and the other columns show the 
heatmaps of the relative error of GIPMNN (the second column), PC-GIPMNN (the third column), and DRM (the fourth column) for problems 
R3, R4, R5, R6, and R7. Obviously, GIPMNN yields less favorable results than DRM. However, by enforcing the interface conditions, PC- 
GIPMNN outperforms GIPMNN and DRM, as shown in the third column. 


interfaces. (30) are implemented as the loss functions (47) and (48), re- 
For the PC-GIPMNN, the interface conditions (29) and spectively, in the 2D example. 
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Fig. 11. (Color online) Geometry of the 2D IAEA benchmark problem with two fuel regions, four control-rod regions, and a water region 
labeled as 1, 2, 3m and 4. This figure is similar to the figure reported in a previous study [40]. 
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Fig. 12. (Color online) First column shows the heatmap of the eigenfunction of FEM (the first column) and the other columns show the 
heatmaps of the relative error of GIPMNN (the second column), PC-GIPMNN (the third column), and DRM (the fourth column) for the 2D 
IAEA benchmark problem. However, by enforcing the interface conditions, PC-GIPMNN outperforms GIPMNN and DRM, as shown in the 
third column. 


[Supl |Sup| | Swo| 
Lossi; = 2, \u(a;,) — plæ;)|? + » \u(ai) — p(a)|? + x |w(a;) — p(a,)|? 
[Srp ‘Seal Spal 
+ 2 Ir(æ;) — p(x) |? + » Ir(æ;) — g(a,)|? + 9 |plæ:) — q(ai)|? 
[San] 


+ 2, lqlæ:) — h(æ:)l?, (47) 
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Lossiz = X |(—D3Vu(ai) +n) — (—D2Vp(ai) - 
j=l 
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[Svp] 


I? + $ \(—DsVo(@i) - n) — (—D2Vp(ai) - n)|? 


i=1 
[Srp] 


+ X |(~D:Vuw(z;) : n) — (~D2Vp(z:) : n)? + $ (-D:Vr(z:) : n) — (—D2Vp(ai) : n)|? 


i=1 
[Spal 


+ X I(-DsVr(z:) - n) — (—DiVq(ai)-n))? + $ (-D2Vp(2:) : n) — (—DiVq(#i) - n)|? 


i=1 


+ XL \(-—DiVaq(ai)-n) — (—DaVh(ai) + n)/?. (48) 


i=1 


The eigenfunction can be expressed as follows: 
(0) = uly + vula + wls + rla + pls + qle + hlz, (49) 


where l1, lo, l3, l4, l5, le, and ly are the indicator functions. 
2. Using DRM to Solve for Eigenvalue 
Given that the homogeneous Neumann boundary condi- 


tion is used, the loss function in DRM omits the impact of 


J= Nrb 


Are ins DIVE (@s)? + a 


the Neumann boundary condition and focuses on the Robin 
boundary condition. The loss function is defined as follows: 


it gole)? + Ate ON, DAG (ai)? (50) 


where Area denotes the area of all regions, Length indicates 
the length of the boundaries other than the x- and y-axes in 
Fig. 11, and Ng denotes the number of points on the Robin 
boundary. 


3. Results 


As discussed above, we also trained the DRM with the 
number of epochs Nepoch = 500000 and found that DRM 
failed to learn the eigenfunction again. In this case, DRM at- 
tains a good ket = 0.9750 and the relative errors of kef and 
ġ are 6.4023 x 107° and 0.9557, respectively. Therefore, we 
retrained the DRM with Nepoch = 50000 and stored the best 
results. 

The relative errors of keg and eigenfunction ¢ are shown in 
Table 10 and Table 11. It was observed that all three methods 
obtained good results, and the relative errors of ker of DRM 
were small. However, the ability of DRM to learn the eigen- 
function was worse than that of GIPMNN and PC-GIPMNN 
and the relative errors of ¢ were the largest. Thus, it can 
be concluded that the PC-GIPMNN successfully learned the 
eigenvalues and eigenfunctions. The same conclusion can be 
drawn from the graphs in Fig. 12. 


area ON wel ole.) 


H 


For the 1D slab reactor, the computation time of FEM is 
1.25 s, and the training times of DRM, GIPMNN, and PC- 
GIPMNN are 4788.37 s, 10614.57 s, and 16052.16 s, respec- 
tively. For the 2D reactor, the computation time of FEM is 
3.64 s, and the training times of DRM, GIPMNN, and PC- 
GIPMNN are 5827.91 s, 18444.39 s, and 108072.41 s, respec- 
tively. For the 2D IAEA benchmark, the computation time of 
FEM is 5.22 s, and the training times of DRM, GIPMNN, and 
PC-GIPMNN are 29352.83 s, 64546.74 s, and 137812.92 s, 
respectively. 

Although PC-GIPMNN was better than DRM and 
GIPMNN, it required significantly more training time to ob- 
tain accurate results. Compared with FEM, neural network 
methods require excessive time. However, neural networks 
are an emerging method and we believe that they will achieve 
better results in the near future. 


V. CONCLUSION 


In this study, we proposed two methods, GIPMNN and PC- 
GIPMNN, to solve generalized K-eigenvalue problems in nu- 
clear reactor physics. We also conducted a comprehensive 
study of GIPMNN, PC-GIPMNN, and DRM. GIPMNN fol- 
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Table 10. Results obtained via GIPMNN, PC-GIPMNN, and DRM when compared with FEM for the 2D IAEA benchmark problem. Specif- 
ically, rel denotes the relative error of kett. Especially, DRM is trained with Nepoch = 50000. 


Problem kefrrem) Ketpcipmnn) Kepeiec-ciemnny KeffORM) kl ciemnn) 


KexchPC-GIPMNN) kilorm 


IAEA 0.9688 0.9692 0.9691 


0.9685 4.0370 x 1077 3.0812 x 1077 2.8218 x 10-4 


Table 11. Results obtained via GIPMNN, PC-GIPMNN, and DRM when compared with FEM for the 2D IAEA benchmark problem. ¢"" 
denotes the relative error of the eigenfunction. Especially, DRM is trained with Nepoch = 50000. 


Problem ¢'!crmny) 


rel 
(o (PC-GIPMNN) 


orm 


IAEA 


8.5688 x 10-7 4.7381 x 10-7 8.9602 x 1077 


lows the main idea of the inverse power method to find the 
smallest eigenvalue. The PC-GIPMNN enforce interface con- 
ditions through multiple neurons in the output layer. The con- 
cept of DRM is to define the function of the Rayleigh quotient 
and form an optimization problem. Unlike DRM solving for 
the smallest eigenvalue by directly minimizing the eigenvalue 
(Rayleigh quotient), the GIPMNN and PC-GIPMNN attain 
the smallest eigenvalue using the iterative method. All the 
methods used neural networks to represent functions and the 
differential was implemented using AD. Finally, we applied 
these three methods to problems in reactor physics. 

Three numerical experiments were conducted to verify the 
applicability and accuracy of the GIPMNN, PC-GIPMNN, 
and DRM. In the first 1D example, we used inner and outer 
iterations for the simulation. According to our test, the best 
ratio of outer and inner iterations was 1 : 1. Furthermore, 
we compared the results of the GIPMNN, PC-GIPMNN, 
and DRM with those of the FEM. For the continuous prob- 
lem, the solution learned by the GIPMNN was more accu- 
rate than those learned by the DRM and PC-GIPMNN. For 
interface problem, the eigenvalue and eigenfunction learned 
by PC-GIPMNN were better than that learned by DRM and 
GIPMNN. This is due to the interface conditions that are im- 
plemented in the loss function of PC-GIPMNN. 

In the 2D examples, we observed the failure of DRM on 
the 2D experiments. The DRM can learn the eigenvalue 
and eigenfunction at the early stage of the training process, 
and the results decrease when Nepoch increases. Therefore, 
we selected Nepoch = 50000 to train the DRM and com- 
pare the results obtained by the GIPMNN, PC-GIPMNN, and 
DRM with those obtained by the FEM. The results show that 
the PC-GIPMNN outperforms the GIPMNN and DRM for 
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